Abstract: In this paper, we design interference-free transceivers for multi-user two-way relay systems, where a multi-antenna base station (BS) simultaneously exchanges information with multiple single-antenna users via a multi-antenna amplify-and-forward relay station (RS). To offer a performance benchmark and provide useful insight into the transceiver structure, we employ alternating optimization to find optimal transceivers at the BS and RS that maximize the bidirectional sum rate. We then propose a low-complexity scheme, where the BS transceiver is the zero-forcing precoder and detector, and the RS transceiver is designed to balance the uplink and downlink sum rates. Simulation results demonstrate that the proposed scheme is superior to the existing zero-forcing and signal alignment schemes, and that the performance gap between the proposed scheme and the alternating optimization is minor.

Index Terms: Two-way relay, multi-user, multi-antenna, transceiver, cellular systems.

I. Introduction 

Two-way relay (TWR) techniques have attracted considerable interest owing to their high spectral efficiency. Most prior works study TWR systems with a single user pair, where two users exchange information via a single relay station (RS) [1]-[3]. Various transmission schemes have been proposed for single-antenna nodes [1] and multi-antenna nodes [2], [3].

Recently, the design of TWR systems has been extended to multi-user cases [4]-[11], which can be roughly divided into two categories based on the system topology, i.e., symmetric and asymmetric systems. In symmetric systems [4]-[6], multiple user pairs exchange information via an RS. In asymmetric systems, a base station (BS) exchanges messages with multiple users [7]-[11], which is a typical scenario in cellular networks.

In this paper, we study a multi-user TWR cellular system, where a multi-antenna BS communicates with multiple single-antenna users bidirectionally via a multi-antenna amplify-and-forward (AF) relay. Owing to its practical importance, there is a considerable amount of work on designing transceivers for such a system [8]-[11]. However, its transceiver optimization is challenging due to the complicated interference among multiple users in the broadcast and multi-access phases, and even its bidirectional sum capacity remains unknown.

Allocating orthogonal time or frequency resources to the uplink and downlink signals of different users is an immediate way to eliminate the interference [7], with which existing single-user TWR techniques can be directly applied. Since this is far from optimal, a further step is to introduce an interference-free constraint, which is essentially the zero-forcing (ZF) principle. Though also suboptimal in the sense of sum rate, such a design can capture the inherent degrees of freedom of the system, which approximately characterize the capacity at high signal-to-noise ratio (SNR). Along this line, several transceivers based on the ZF principle have been proposed. Considering that the RS is equipped with multiple antennas, a natural solution is to apply a ZF transceiver at the RS to separate all the signals from and to the BS and users [8]. This ZF scheme employs orthogonal spatial resources to differentiate the links, so the RS must be equipped with enough antennas. To remove all the interference, at least $2N$ antennas are required at the RS for a system with $N$ antennas at the BS and $N$ single-antenna users. When the RS has only $N$ antennas, the multiple antennas at the BS also need to be exploited to ensure interference-free transmission. In [9]-[11], the concept of signal alignment (SA) [12] is employed to reduce the number of interfering signals experienced at the relay. The SA scheme exploits the self-interference cancelation (SIC) [13] ability of TWR. Its basic idea is to project the uplink and downlink signals of each user onto the same spatial direction at the RS through proper BS precoding, such that the RS can separate the $N$ superimposed signals. After receiving a superimposed signal forwarded by the RS, each user removes its transmitted uplink signal via SIC and obtains its desired downlink signal.

Both the ZF and SA schemes are based on the ZF principle. Nonetheless, they are not the only interference-free solutions¹. In fact, by analyzing the feasibility of the interference-free constraints for multi-user multi-antenna TWR cellular systems, it is not hard to show that the SA scheme is the unique solution only for special antenna configurations, and the ZF scheme ensures interference-free transmission only when the number of antennas at the RS is sufficiently large. Moreover, both of them are designed as low-complexity schemes without taking the sum rate into account.

¹ By "interference-free solutions", we refer to the transmit strategies that can remove all interference. These solutions include the ZF beamforming and ZF detector, the SA scheme, as well as the transmit schemes using orthogonal frequency or time resources, which null the interference completely.

In this paper, we strive to find a low-complexity interference-free transceiver aimed at maximizing the sum rate under general antenna settings. To provide a performance benchmark as well as useful insight into the transceiver structure, we employ a standard alternating optimization technique [14] to optimize the BS and RS transceivers, aiming at maximizing the bidirectional sum rate under interference-free constraints. To develop a low-complexity transceiver scheme, we fix the BS transceiver as the BS precoder and detector that the alternating optimization finds to be optimal in the high-power region. Based on this, we first optimize the RS transceiver to separately maximize the uplink and downlink sum rates, and then balance the two to maximize the bidirectional sum rate. Simulation results show that the balanced scheme performs very close to the alternating optimization solution and outperforms the existing ZF and SA schemes under various scenarios.

The rest of the paper is organized as follows. Section II describes the system model. Section III introduces the alternating optimization solution. The balanced transceiver scheme is proposed in Section IV. Simulation results are given in Section V, and conclusions are drawn in Section VI. The major symbols used in the paper are summarized in Table I.

II. System Model 

We consider a multi-user multi-antenna TWR system, which consists of a BS equipped with $N_B$ antennas, an RS equipped with $N_R$ antennas, and $N_U$ single-antenna users. The BS and the users exchange downlink and uplink information via the RS, as shown in Fig. 1. The bidirectional transmission takes place in two phases.

Fig. 1. System model of the multi-user multi-antenna TWR cellular system

In the first phase, both the BS and the users transmit to the RS. The received signal at the RS is given by

$$\mathbf{y}_r = \mathbf{H}_{br}\mathbf{W}_{bt}\mathbf{x}_b + \sqrt{P_U}\,\mathbf{H}_{ur}\mathbf{x}_u + \mathbf{n}_r, \tag{1}$$

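where $\mathbf{H}_{br} \in \mathbb{C}^{N_R \times N_B}$ is the channel matrix from the BS to the RS, $\mathbf{H}_{ur} = (\mathbf{h}_{1r}, \cdots, \mathbf{h}_{N_U r})$, $\mathbf{h}_{ir} \in \mathbb{C}^{N_R \times 1}$ is the channel vector from the $i$th user to the RS, $\mathbf{x}_b$ and $\mathbf{x}_u$ are the downlink and uplink signal vectors to and from the $N_U$ users, for which we assume $E(\mathbf{x}_b\mathbf{x}_b^H) = E(\mathbf{x}_u\mathbf{x}_u^H) = \mathbf{I}_{N_U}$, $P_U$ is the transmit power of each user, $\mathbf{n}_r$ is the Gaussian noise vector at the RS with zero mean and covariance matrix $N_0\mathbf{I}_{N_R}$, and $\mathbf{W}_{bt} \in \mathbb{C}^{N_B \times N_U}$ is the precoder matrix at the BS, which satisfies the transmit power constraint

$$\lVert\mathbf{W}_{bt}\rVert^2 \le P_B, \tag{2}$$

where $P_B$ is the maximal transmit power of the BS.

TABLE I
List of important symbols

| Symbol | Meaning |
|---|---|
| $N_B$, $N_R$, $N_U$ | BS or RS antenna number or user number |
| $\mathbf{H}_{br}$, $\mathbf{H}_{ur}$ | Channel matrix from the BS or from all users to the RS |
| $\mathbf{h}_{ir}$ | Channel vector from the $i$th user to the RS |
| $\bar{\mathbf{H}}_{ir}$ | Channel matrix from all users other than the $i$th user to the RS. It is obtained from $\mathbf{H}_{ur}$ with the $i$th column, $\mathbf{h}_{ir}$, being removed |
| $\mathbf{W}_{bt}$, $\mathbf{W}_{br}$ | BS transmit or receive weighting matrix |
| $\mathbf{w}_{bti}$, $\mathbf{w}_{bri}$ | The $i$th column of $\mathbf{W}_{bt}$ or $\mathbf{W}_{br}$ |
| $\mathbf{W}_r$ | RS weighting matrix |
| $\mathbf{x}_b$ | Downlink signal vector transmitted by the BS |
| $\mathbf{x}_u$ | Uplink signal vector transmitted by all users |
| $\mathbf{y}_r$ | RS's received signal vector in the first phase |
| $\mathbf{y}_b$, $y_{ui}$ | BS's or the $i$th user's received signal in the second phase |
| $P_B$, $P_R$, $P_U$ | Transmit power of the BS, the RS, or a single user |
| $N_0$ | Noise variance |
| $R_U$, $R_D$, $R_S$ | Uplink, downlink, or bidirectional sum rate |
| $\mathbf{I}_N$ | Identity matrix of size $N$ |
| $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^*$ | Transpose, conjugate transpose, or conjugate of a matrix |
| $\lVert\cdot\rVert$, $(\cdot)^{\dagger}$ | Norm or pseudo inverse of a matrix |
| $S_{\perp}(\mathbf{X})$ | Orthogonal subspace of matrix $\mathbf{X}$: $S_{\perp}(\mathbf{X}) = \mathbf{I} - \mathbf{X}^H(\mathbf{X}\mathbf{X}^H)^{-1}\mathbf{X}$ if $\mathbf{X}$ is a wide matrix; $S_{\perp}(\mathbf{X}) = \mathbf{I} - \mathbf{X}(\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H$ if $\mathbf{X}$ is a tall matrix |
| $\mathrm{diag}(\mathbf{m})$ | Diagonal matrix whose diagonal elements are the elements of vector $\mathbf{m}$ |
| $E(\cdot)$ | Mean value of a random variable |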

In the second phase, the RS precodes its received signals and then broadcasts them to the BS and the users. The received signals at the BS and the $i$th user are respectively given by

$$\mathbf{y}_b = \mathbf{W}_{br}^T(\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{y}_r + \mathbf{n}_b), \tag{3}$$

$$y_{ui} = \mathbf{h}_{ir}^T\mathbf{W}_r\mathbf{y}_r + n_{ui}, \quad 1 \le i \le N_U, \tag{4}$$

where $\mathbf{W}_r \in \mathbb{C}^{N_R \times N_R}$ is the weighting matrix at the RS, $\mathbf{W}_{br} \in \mathbb{C}^{N_B \times N_U}$ is the receive weighting matrix at the BS, and $\mathbf{n}_b$ and $n_{ui}$ are the Gaussian noises at the BS and the $i$th user, each with zero mean and variance $N_0$.

The RS weighting matrix should satisfy the transmit power constraint $E(\lVert\mathbf{W}_r\mathbf{y}_r\rVert^2) \le P_R$, which can be rewritten as follows after substituting (1),

$$\lVert\mathbf{W}_r\mathbf{H}_{br}\mathbf{W}_{bt}\rVert^2 + P_U\lVert\mathbf{W}_r\mathbf{H}_{ur}\rVert^2 + N_0\lVert\mathbf{W}_r\rVert^2 \le P_R, \tag{5}$$

where $P_R$ is the maximal transmit power of the RS².

All channels are assumed to be independent, quasi-static and flat fading. We consider time division duplexing for simplicity, hence the channels in the first and second phases are assumed reciprocal. We assume that the BS and RS have global channel information of all links, as in [9]-[11].

III. Transceiver Design Based on Alternating Optimization

Even after we introduce the interference-free constraints, the problem of jointly optimizing the BS and RS transceivers to maximize the bidirectional sum rate of multi-user multi-antenna TWR systems is still non-convex and very hard to deal with. In this section, we employ a standard tool, alternating optimization [14], to solve the optimization problem. The result can serve as a performance benchmark for interference-free transceivers.

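Substituting (1) into (3), the received signal at the BS can be rewritten as

$$\mathbf{y}_b = \mathbf{W}_{br}^T\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{H}_{br}\mathbf{W}_{bt}\mathbf{x}_b + \sqrt{P_U}\,\mathbf{W}_{br}^T\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{H}_{ur}\mathbf{x}_u + \mathbf{W}_{br}^T\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{n}_r + \mathbf{W}_{br}^T\mathbf{n}_b, \tag{6}$$

where the first term is the signal transmitted by the BS in the first phase, which can be removed by SIC, the second term is the desired uplink signal, and the last two terms are the noise amplified by the RS and the noise at the BS receiver, respectively.

To eliminate the interference among the uplink signals, the following constraint should be satisfied,

$$\mathbf{w}_{bri}^T\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{h}_{jr} = 0, \quad i \ne j, \tag{7}$$

where $\mathbf{w}_{bri}$ is the $i$th column of $\mathbf{W}_{br}$.

Substituting (1) into (4), the received signal at the $i$th user can be rewritten as

$$y_{ui} = \mathbf{h}_{ir}^T\mathbf{W}_r\mathbf{H}_{br}\mathbf{W}_{bt}\mathbf{x}_b + \sqrt{P_U}\,\mathbf{h}_{ir}^T\mathbf{W}_r\mathbf{H}_{ur}\mathbf{x}_u + \mathbf{h}_{ir}^T\mathbf{W}_r\mathbf{n}_r + n_{ui}, \quad 1 \le i \le N_U, \tag{8}$$

where the first term consists of the downlink signals for all $N_U$ users, the second term consists of the signals transmitted by the $N_U$ users in the first phase, and the last two terms are noises.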

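To remove the interference, the BS and RS transceivers should satisfy the following constraints,

$$\mathbf{h}_{ir}^T\mathbf{W}_r\mathbf{H}_{br}\mathbf{w}_{btj} = 0, \quad i \ne j, \tag{9}$$

$$\mathbf{h}_{jr}^T\mathbf{W}_r\mathbf{h}_{ir} = 0, \quad i \ne j, \tag{10}$$

where $\mathbf{w}_{btj}$ is the $j$th column of $\mathbf{W}_{bt}$.

Considering the inter-user interference (IUI) free constraints (7), (9) and (10), and the fact that the self-interference can be canceled [13], the receive SNRs of the $i$th uplink and downlink signals can be respectively obtained as

$$\mathrm{SNR}_{Ui} = \frac{P_U\,\lvert\mathbf{w}_{bri}^T\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{h}_{ir}\rvert^2}{N_0\lVert\mathbf{w}_{bri}^T\mathbf{H}_{br}^T\mathbf{W}_r\rVert^2 + N_0\lVert\mathbf{w}_{bri}\rVert^2}, \qquad \mathrm{SNR}_{Di} = \frac{\lvert\mathbf{h}_{ir}^T\mathbf{W}_r\mathbf{H}_{br}\mathbf{w}_{bti}\rvert^2}{N_0\lVert\mathbf{h}_{ir}^T\mathbf{W}_r\rVert^2 + N_0}. \tag{11}$$

Then the bidirectional sum rate of the TWR system is³

$$R_S = R_U + R_D = \sum_{i=1}^{N_U}(R_{Ui} + R_{Di}) = \sum_{i=1}^{N_U}\left[\tfrac{1}{2}\log_2(1 + \mathrm{SNR}_{Ui}) + \tfrac{1}{2}\log_2(1 + \mathrm{SNR}_{Di})\right], \tag{12}$$

where $R_U$ and $R_D$ denote the uplink and downlink sum rates, $R_{Ui}$ and $R_{Di}$ are the uplink and downlink data rates of the $i$th user, and the pre-log factor 1/2 is due to the half-duplex constraint.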
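As a small numerical illustration of (12), the following sketch evaluates the uplink, downlink and bidirectional sum rates from given per-user SNRs. The function name `bidirectional_sum_rate` and the example SNR values are assumptions introduced here for illustration; they do not come from the paper's simulations.

```python
import math


def bidirectional_sum_rate(snr_up, snr_down):
    """Return (R_U, R_D, R_S) in bit/s/Hz for linear-scale per-user SNRs.

    Each stream contributes 0.5 * log2(1 + SNR); the factor 0.5 models the
    half-duplex, two-phase transmission, as in eq. (12).
    """
    if len(snr_up) != len(snr_down):
        raise ValueError("need one uplink and one downlink SNR per user")
    r_up = sum(0.5 * math.log2(1.0 + s) for s in snr_up)
    r_down = sum(0.5 * math.log2(1.0 + s) for s in snr_down)
    return r_up, r_down, r_up + r_down


# Hypothetical two-user example (SNR values chosen only for easy arithmetic).
r_u, r_d, r_s = bidirectional_sum_rate([3.0, 15.0], [1.0, 7.0])
print(r_u, r_d, r_s)
```

The SNR inputs would in practice be computed from (11) for a given choice of the three weighting matrices.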

In the following, we optimize one of the three transceiver matrices by fixing the other two.

A. Optimization of Weighting Matrix $\mathbf{W}_r$ of RS

Here we fix $\mathbf{W}_{bt}$ and $\mathbf{W}_{br}$, and optimize $\mathbf{W}_r$ to maximize the bidirectional sum rate under the RS transmit power constraint and the IUI-free constraints by solving the following problem,

$$\begin{aligned} \max_{\mathbf{W}_r}\quad & R_S && \text{(13a)} \\ \text{s.t.}\quad & \text{(5), (7), (9) and (10)}. && \text{(13b)} \end{aligned}$$
The bidirectional sum rate $R_S$ is not a concave function of $\mathbf{W}_r$. To solve this non-convex problem and find the maximum $R_S$, we employ the concept of rate profile, which is introduced in [13] to characterize the boundary rate-tuples of a capacity region. We introduce a vector $\boldsymbol{\beta} = (\beta_1, \cdots, \beta_{2N_U})$ to specify the rate profile, where $\sum_{i=1}^{2N_U}\beta_i = 1$ and $\beta_i \ge 0$. Then by solving the following optimization problem,

$$\begin{aligned} \max_{\mathbf{W}_r}\quad & R_S && \text{(14a)} \\ \text{s.t.}\quad & R_{Ui} \ge \beta_i R_S,\ \ R_{Di} \ge \beta_{i+N_U} R_S,\ \ 1 \le i \le N_U, && \text{(14b)} \\ & \text{(5), (7), (9) and (10)}, && \text{(14c)} \end{aligned}$$

we achieve the boundary point of the achievable rate region specified by each vector $\boldsymbol{\beta}$. After searching for the optimal $\boldsymbol{\beta}$ over all its possible values, we can find the boundary point corresponding to the maximum sum rate. For the multi-user case, it is too complicated to search over all possible $\boldsymbol{\beta}$. To reduce the complexity, we use the bisection algorithm [16] to search for the optimal $\boldsymbol{\beta}$ as in [8]. Although it is hard to rigorously prove that the achievable rate region boundary is a convex hull in terms of $\boldsymbol{\beta}$, simulation results show that the bisection algorithm offers the same result as brute-force search.

² We do not consider power control at the BS and RS. The inequality power constraints are used to simplify the optimization.

To solve the problem (13), we apply an approach similar to that in [3] to convert the optimization problem (14) to a semidefinite programming (SDP) problem with a rank-1 constraint, and then resort to the widely used semidefinite relaxation [17] to handle the problem.

³ The received non-white noise after amplifying and forwarding is treated as white noise, as in the existing literature. This is in fact the worst case of the problem, therefore the data rate obtained by $\log_2(1 + \mathrm{SNR})$ can serve as a lower bound.

B. Optimization of Transmit Weighting Matrix $\mathbf{W}_{bt}$ of BS

In this subsection, we design $\mathbf{W}_{bt}$ given $\mathbf{W}_{br}$ and $\mathbf{W}_r$. Since the transmit weighting matrix of the BS only affects the downlink rate when $\mathbf{W}_{br}$ and $\mathbf{W}_r$ are fixed, we design it to maximize the downlink sum rate $R_D$. The design of $\mathbf{W}_{bt}$ should consider the IUI-free constraint (9) and the BS transmit power constraint (2). It is also associated with the RS transmit power constraint (5). Then the optimization problem can be formulated as

$$\begin{aligned} \max_{\mathbf{W}_{bt}}\quad & R_D && \text{(15a)} \\ \text{s.t.}\quad & \text{(2), (5) and (9)}. && \text{(15b)} \end{aligned}$$

This is also a non-convex problem, which can be solved by the same method as that used to solve problem (13). Define a vector $\boldsymbol{\beta} = (\beta_1, \cdots, \beta_{N_U})$, where $\sum_{i=1}^{N_U}\beta_i = 1$ and $\beta_i \ge 0$. The solution of (15) can be found by solving the following problem and searching for the optimal $\boldsymbol{\beta}$,

$$\begin{aligned} \max_{\mathbf{W}_{bt}}\quad & R_D && \text{(16a)} \\ \text{s.t.}\quad & R_{Di} \ge \beta_i R_D,\ \ 1 \le i \le N_U, && \text{(16b)} \\ & \text{(2), (5) and (9)}. && \text{(16c)} \end{aligned}$$

Each of the BS and RS power constraints (2) and (5) imposes a constraint on the norm of a linear function of $\mathbf{W}_{bt}$. The IUI-free constraint (9) is a linear constraint on $\mathbf{W}_{bt}$. According to [11], the rate tuple constraint (16b) can be converted to linear constraints on $\mathbf{W}_{bt}$. Therefore the constraints in (16) form a second-order-cone feasible region [17], and the size of the feasible region depends on $R_D$. Consequently, we can solve (16) by searching for the maximal $R_D$ that guarantees a non-empty feasible region. The bisection method is applied to search for $R_D$. We use the CVX tool [18] to check whether the feasible region is empty. If it is not empty, CVX returns a value of $\mathbf{W}_{bt}$ in the feasible region. Finally, we obtain both the maximum value of $R_D$ and the optimal $\mathbf{W}_{bt}$.
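The bisection over the target rate described above can be sketched as follows. This is only an outline under stated assumptions: `is_feasible` is a hypothetical stand-in for the second-order-cone feasibility check that the paper performs with CVX, and the toy oracle below simply declares a rate feasible when it does not exceed a known threshold.

```python
import math


def max_feasible_rate(is_feasible, r_low, r_high, tol=1e-6):
    """Bisection for the largest rate in [r_low, r_high] with a non-empty
    feasible region, assuming feasibility is monotone in the rate."""
    if not is_feasible(r_low):
        raise ValueError("lower end of the search interval must be feasible")
    while r_high - r_low > tol:
        mid = 0.5 * (r_low + r_high)
        if is_feasible(mid):
            r_low = mid    # region non-empty: try a larger target rate
        else:
            r_high = mid   # region empty: reduce the target rate
    return r_low


# Toy oracle (an assumption for illustration, not the SOCP from the paper):
# a rate is feasible if it does not exceed 0.5 * log2(1 + 15) = 2 bit/s/Hz.
toy_cap = 0.5 * math.log2(1.0 + 15.0)
best_rate = max_feasible_rate(lambda r: r <= toy_cap, 0.0, 10.0, tol=1e-9)
print(best_rate)
```

In the actual design, each call to the oracle would also return a feasible $\mathbf{W}_{bt}$, so the last feasible point found doubles as the precoder.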

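C. Optimization of Receive Weighting Matrix $\mathbf{W}_{br}$ of BS

Given $\mathbf{W}_{bt}$ and $\mathbf{W}_r$, $\mathbf{W}_{br}$ only affects the uplink sum rate. Among the three IUI-free constraints, $\mathbf{W}_{br}$ is only associated with (7). Therefore, the optimization problem can be formulated as

$$\max_{\mathbf{W}_{br}}\ R_U \quad \text{s.t.}\quad \mathbf{w}_{bri}^T\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{h}_{jr} = 0, \quad i \ne j. \tag{17}$$

According to (11) and (12), the data rate of each uplink stream, $R_{Ui}$, is only a function of $\mathbf{w}_{bri}$. Therefore, this problem can be decoupled into $N_U$ subproblems. Since $R_{Ui}$ is a monotonically increasing function of $\mathrm{SNR}_{Ui}$, each subproblem can be formulated as

$$\begin{aligned} \max_{\mathbf{w}_{bri}}\quad & \mathrm{SNR}_{Ui} && \text{(18a)} \\ \text{s.t.}\quad & \mathbf{w}_{bri}^T\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{h}_{jr} = 0, \quad j \ne i. && \text{(18b)} \end{aligned}$$

Any feasible $\mathbf{w}_{bri}$ should satisfy $\mathbf{w}_{bri}^T\mathbf{H}_{br}^T\mathbf{W}_r\bar{\mathbf{H}}_{ir} = \mathbf{0}$, where $\bar{\mathbf{H}}_{ir}$ is $\mathbf{H}_{ur}$ with its $i$th column being removed. Define $\mathbf{U}_{\perp i}$ as a matrix consisting of all the singular vectors of $\mathbf{H}_{br}^T\mathbf{W}_r\bar{\mathbf{H}}_{ir}$ corresponding to its zero singular values, taken such that $\mathbf{U}_{\perp i}^T\mathbf{H}_{br}^T\mathbf{W}_r\bar{\mathbf{H}}_{ir} = \mathbf{0}$. Then we have

$$\mathbf{w}_{bri} = \mathbf{U}_{\perp i}\,\mathbf{x}, \tag{19}$$

where $\mathbf{x}$ is an arbitrary vector.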

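Rewrite the expression of $\mathrm{SNR}_{Ui}$ in (11) as follows,

$$\mathrm{SNR}_{Ui} = \frac{\mathbf{w}_{bri}^T\left(P_U\,\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{h}_{ir}\mathbf{h}_{ir}^H\mathbf{W}_r^H\mathbf{H}_{br}^*\right)\mathbf{w}_{bri}^*}{\mathbf{w}_{bri}^T\left(N_0\,\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{W}_r^H\mathbf{H}_{br}^* + N_0\,\mathbf{I}_{N_B}\right)\mathbf{w}_{bri}^*} \triangleq \frac{\mathbf{w}_{bri}^T\mathbf{K}_S\,\mathbf{w}_{bri}^*}{\mathbf{w}_{bri}^T\mathbf{K}_{IN}\,\mathbf{w}_{bri}^*}, \tag{20}$$

where $\mathbf{K}_S \triangleq P_U\,\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{h}_{ir}\mathbf{h}_{ir}^H\mathbf{W}_r^H\mathbf{H}_{br}^*$ and $\mathbf{K}_{IN} = N_0\,\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{W}_r^H\mathbf{H}_{br}^* + N_0\,\mathbf{I}_{N_B}$.

By substituting (19) into (20), the optimization problem (18) becomes

$$\max_{\mathbf{x}}\ \frac{\mathbf{x}^T\mathbf{U}_{\perp i}^T\mathbf{K}_S\mathbf{U}_{\perp i}^*\mathbf{x}^*}{\mathbf{x}^T\mathbf{U}_{\perp i}^T\mathbf{K}_{IN}\mathbf{U}_{\perp i}^*\mathbf{x}^*}, \tag{21}$$

which is a generalized Rayleigh ratio problem. The optimal $\mathbf{x}^*$ is the eigenvector of $(\mathbf{U}_{\perp i}^T\mathbf{K}_{IN}\mathbf{U}_{\perp i}^*)^{-1}\mathbf{U}_{\perp i}^T\mathbf{K}_S\mathbf{U}_{\perp i}^*$ corresponding to its largest eigenvalue [19].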

By now, we have solved the three problems (13), (15) and (17). To find the alternating optimization solution, we need to assign initial values to the transceiver matrices, which should satisfy all the IUI-free constraints and the transmit power constraints. The initial values are set according to the following procedure.

First, constraint (10) can be rewritten as a group of linear equations in $\mathbf{W}_r$ as $(\mathbf{h}_{ir}^T \otimes \mathbf{h}_{jr}^T)\,\mathrm{vec}(\mathbf{W}_r) = 0$, $i \ne j$, where $\otimes$ denotes the Kronecker product, and $\mathrm{vec}(\cdot)$ is the vectorization of a matrix by stacking its columns. The general solution of this equation is given by

$$\mathrm{vec}(\mathbf{W}_r) = S_{\perp}(\mathbf{K}_0)\,\mathbf{x}, \tag{22}$$

where $\mathbf{K}_0$ is the matrix obtained by stacking all $\mathbf{h}_{ir}^T \otimes \mathbf{h}_{jr}^T$, $i \ne j$, $S_{\perp}(\cdot)$ is the orthogonal subspace of a matrix, and $\mathbf{x}$ is an arbitrary vector.

We pick one $\mathbf{W}_r$ from the general solution. Then we substitute the chosen $\mathbf{W}_r$ into (9) and (7), find the general solutions of these two sets of equations similarly to (22), and pick one $\mathbf{W}_{bt}$ and one $\mathbf{W}_{br}$ among the general solutions. Finally, we multiply $\mathbf{W}_{bt}$ and $\mathbf{W}_r$ by proper scalars to satisfy the BS and RS power constraints.
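The rewriting of constraint (10) as a linear equation in $\mathrm{vec}(\mathbf{W}_r)$ rests on the identity $\mathbf{h}_{jr}^T\mathbf{W}_r\mathbf{h}_{ir} = (\mathbf{h}_{ir}^T \otimes \mathbf{h}_{jr}^T)\,\mathrm{vec}(\mathbf{W}_r)$. The sketch below checks this identity numerically in plain Python; the helper names and the small complex-valued example are assumptions made here for illustration only.

```python
def kron_row(a, b):
    """Kronecker product of two row vectors given as lists."""
    return [x * y for x in a for y in b]


def vec(matrix):
    """Stack the columns of a matrix (list of rows) into one list."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]


def bilinear(h_j, w, h_i):
    """Evaluate h_j^T W h_i directly (plain transposes, no conjugation)."""
    return sum(h_j[r] * w[r][c] * h_i[c]
               for r in range(len(w)) for c in range(len(w[0])))


def dot(a, b):
    """Unconjugated inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


# Arbitrary 2 x 2 example values (hypothetical, chosen only for the check).
h_i = [1 + 1j, 2 + 0j]
h_j = [0.5j, -1 + 0j]
w_r = [[1 + 0j, 2j], [3 + 0j, 4 - 1j]]

lhs = bilinear(h_j, w_r, h_i)
rhs = dot(kron_row(h_i, h_j), vec(w_r))
print(abs(lhs - rhs))
```

Each pair $i \ne j$ contributes one such row, and stacking the rows gives the matrix $\mathbf{K}_0$ whose null space is parameterized in (22).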

After assigning the initial values, we alternately optimize one of the three transceiver matrices by fixing the other two. The sum rate must increase with each iteration; otherwise, the iteration is terminated. Since the sum rate is bounded under the power constraints, this requirement ensures that the alternating procedure converges. Because of the non-convex nature of the optimization problem, the converged solution is not guaranteed to be globally optimal and depends on the initial values. Nevertheless, we can increase the probability of achieving the maximal bidirectional sum rate by repeating the alternating optimization procedure with multiple random initial values and picking the best solution.
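The control flow of this procedure, including the stopping rule, can be sketched generically as below. The sketch is not the transceiver optimization itself: the two update functions and the concave toy objective are assumptions standing in for the three matrix subproblems and the bidirectional sum rate.

```python
def alternating_optimization(state, update_steps, objective,
                             max_iters=100, tol=1e-9):
    """Cycle through block updates; stop once the objective stops increasing."""
    best = objective(state)
    for _ in range(max_iters):
        candidate = dict(state)
        for step in update_steps:   # optimize one block, others held fixed
            candidate = step(candidate)
        value = objective(candidate)
        if value <= best + tol:     # no further increase: terminate
            break
        state, best = candidate, value
    return state, best


# Toy concave objective (illustrative only), maximized at x = 2, y = -1.
def toy_objective(s):
    return -(s["x"] ** 2 + s["y"] ** 2 + s["x"] * s["y"] - 3.0 * s["x"])


def update_x(s):
    # Exact maximizer over x for fixed y: x = (3 - y) / 2.
    return {**s, "x": (3.0 - s["y"]) / 2.0}


def update_y(s):
    # Exact maximizer over y for fixed x: y = -x / 2.
    return {**s, "y": -s["x"] / 2.0}


solution, value = alternating_optimization(
    {"x": 0.0, "y": 0.0}, [update_x, update_y], toy_objective)
print(solution, value)
```

Restarting the loop from several initial points and keeping the best result mirrors the multiple-initialization strategy described above.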
IV. Balanced Transceivers

In this section, we design a low-complexity transceiver aimed at achieving the maximal bidirectional sum rate under the interference-free constraints. To this end, we decouple the joint optimization of the BS and RS transceivers by resorting to asymptotic analysis in the high-power region. Specifically, we first find the BS precoder and detector by analyzing the asymptotic results of the alternating optimization solution. Then we optimize the RS transceiver based on the given BS transceiver, also in the high-power region.

A. BS Transceivers

1) BS Precoder: To obtain a closed-form solution, we consider an asymptotic region where the transmit power of the RS goes to infinity. When $P_R \to \infty$, the RS transmit power constraint can be ignored, and the BS precoder optimization problem in (15) can be reformulated as

$$\begin{aligned} \max_{\mathbf{W}_{bt}}\quad & \sum_{i=1}^{N_U}\tfrac{1}{2}\log_2\left(1 + \frac{\lvert\mathbf{h}_{ir}^T\mathbf{W}_r\mathbf{H}_{br}\mathbf{w}_{bti}\rvert^2}{N_0\lVert\mathbf{h}_{ir}^T\mathbf{W}_r\rVert^2 + N_0}\right) \\ \text{s.t.}\quad & \lVert\mathbf{W}_{bt}\rVert^2 \le P_B, \\ & \mathbf{h}_{ir}^T\mathbf{W}_r\mathbf{H}_{br}\mathbf{w}_{btj} = 0, \quad i \ne j, \end{aligned} \tag{23}$$

which can be viewed as a linear precoder optimization for a downlink multi-user multi-antenna system with channel matrix $\mathbf{H}_{ur}^T\mathbf{W}_r\mathbf{H}_{br}$ that maximizes the sum rate under the interference-free constraints and the total transmit power constraint. According to [20], the optimal precoder is a ZF precoder with proper power allocation, i.e.,

$$\mathbf{W}_{bt} = (\mathbf{H}_{ur}^T\mathbf{W}_r\mathbf{H}_{br})^{\dagger}\,\mathbf{G}_b, \tag{24}$$

where $\mathbf{G}_b$ is a diagonal power allocation matrix. For simplicity, we consider equal power allocation at the BS, i.e.,

$$\lVert\mathbf{w}_{bti}\rVert^2 = P_B/N_U. \tag{25}$$

2) BS Detector: To obtain a closed-form detector, we consider another asymptotic region where the transmit power of the BS or the users approaches infinity. When $P_U \to \infty$ or $P_B \to \infty$, the received SNR at the RS in the first phase goes to infinity, and the noise forwarded by the RS can be neglected⁴. In this case, $\mathbf{K}_{IN}$ in (21) is $N_0\mathbf{I}_{N_B}$. By solving the problem (21) and applying (19), the optimal BS receive vector $\mathbf{w}_{bri}$ can be obtained as $\mathbf{U}_{\perp i}\mathbf{U}_{\perp i}^H(\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{h}_{ir})^*$, where $\mathbf{U}_{\perp i}\mathbf{U}_{\perp i}^H$ spans the orthogonal subspace of $\mathbf{H}_{br}^T\mathbf{W}_r\bar{\mathbf{H}}_{ir}$ [19]. Therefore, the optimal $\mathbf{w}_{bri}$ is the projection of $\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{h}_{ir}$ onto the orthogonal subspace of $\mathbf{H}_{br}^T\mathbf{W}_r\bar{\mathbf{H}}_{ir}$. In other words, the optimal solution is a ZF receiver for the equivalent uplink channel $\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{H}_{ur}$, i.e.,

$$\mathbf{W}_{br} = \left[(\mathbf{H}_{br}^T\mathbf{W}_r\mathbf{H}_{ur})^{\dagger}\right]^T. \tag{26}$$

Note that the BS precoder and detector obtained in (24) and (26) are not optimal for practical systems with finite transmit power. Nonetheless, we will show later by simulations that these ZF transceivers perform fairly well even when the transmit powers are finite.



B. RS Transceiver

Now we find the solution of the RS transceiver from (13), given the BS transceivers (24) and (26). The IUI-free constraints (7) and (9) are satisfied owing to the use of ZF transceivers at the BS, and thus can be removed. Since we consider equal power allocation in the BS precoder, the optimization problem of the RS transceiver can be reformulated as

$$\begin{aligned} \max_{\mathbf{W}_r}\quad & R_S && \text{(27a)} \\ \text{s.t.}\quad & \text{(5), (10), (24), (25) and (26)}. && \text{(27b)} \end{aligned}$$

To find a low-complexity solution for this non-convex problem, we decouple it into two subproblems, which respectively maximize the uplink and the downlink sum rate. Then we combine the two solutions to maximize the bidirectional sum rate.

When the transmit power of each user goes to zero, i.e., $P_U \to 0$⁵, the system uplink sum rate approaches zero, and $R_S \to R_D$. From (11) and (12), the downlink sum rate $R_D$ does not depend on $\mathbf{W}_{br}$, therefore the constraint (26) in problem (27) can be removed. Moreover, in this case the RS received signal in the first phase satisfies $\mathbf{y}_r \to \mathbf{H}_{br}\mathbf{W}_{bt}\mathbf{x}_b + \mathbf{n}_r$. Then the RS power constraint (5) can be rewritten as $\lVert\mathbf{W}_r\mathbf{H}_{br}\mathbf{W}_{bt}\rVert^2 + N_0\lVert\mathbf{W}_r\rVert^2 \le P_R$. Consequently, the problem (27) reduces to the following problem that maximizes the downlink sum rate,

$$\begin{aligned} \max_{\mathbf{W}_r}\quad & R_D \\ \text{s.t.}\quad & \text{(10), (24), (25) and} \\ & \lVert\mathbf{W}_r\mathbf{H}_{br}\mathbf{W}_{bt}\rVert^2 + N_0\lVert\mathbf{W}_r\rVert^2 \le P_R. \end{aligned} \tag{28}$$

Similarly, when the BS transmit power goes to zero, i.e., $P_B \to 0$, the problem (27) reduces to the following problem that maximizes the uplink sum rate,

$$\begin{aligned} \max_{\mathbf{W}_r}\quad & R_U \\ \text{s.t.}\quad & \text{(10), (26) and} \\ & P_U\lVert\mathbf{W}_r\mathbf{H}_{ur}\rVert^2 + N_0\lVert\mathbf{W}_r\rVert^2 \le P_R. \end{aligned} \tag{29}$$

We will first solve these two subproblems, then combine the two solutions of $\mathbf{W}_r$ to balance the uplink and downlink rates, so as to maximize the bidirectional sum rate.

1) Design of $\mathbf{W}_r$ From Subproblem (28): We can show that the optimal solution of (28) has the following structure (see Appendix),

$$\mathbf{W}_r = (\mathbf{H}_{ur}^T)^{\dagger}\,\mathbf{G}_{r1}\mathbf{U}^T, \tag{30}$$

where $\mathbf{G}_{r1}$ is a diagonal matrix and each column of $\mathbf{U}$ has unit norm, i.e., $\lVert\mathbf{u}_i\rVert = 1$.

The optimal structure of $\mathbf{W}_r$ can be intuitively explained as follows. When $P_U \to 0$, there is only downlink transmission, i.e., the RS receives signals from the BS and then forwards them to the users. In this case, $(\mathbf{H}_{ur}^T)^{\dagger}$ represents the ZF precoder at the RS to broadcast signals to the users, $\mathbf{G}_{r1}$ is a power allocation matrix for the different signal streams, and $\mathbf{U}^T$ is the receive weighting matrix at the RS, which separates the $N_U$ downlink signals from the BS, see Fig. 2.

Fig. 2. Structure of the optimal RS transceiver for downlink transmission

To obtain the power allocation matrix $\mathbf{G}_{r1} = \mathrm{diag}(p_{r1}, \cdots, p_{rN_U})$, we simply let the amplification coefficients at the RS be identical for all streams. Denote each column of $(\mathbf{H}_{ur}^T)^{\dagger}$ as $\mathbf{q}_i$. Since each downlink data stream is received by $\mathbf{u}_i^T$, amplified by $p_{ri}$, and forwarded by $\mathbf{q}_i$, we design $p_{ri}$ to ensure that each $p_{ri}\mathbf{q}_i\mathbf{u}_i^T$ has the same norm.

⁴ This is not true for the case of deep fading where the channel coefficient is approximately zero, but such a case has low probability.

⁵ This does not conflict with the optimality conditions of the ZF transceivers at the BS, which are $P_R \to \infty$ and either $P_B \to \infty$ or $P_U \to \infty$.

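Our next task is to design the receive weighting matrix $\mathbf{U}$. Upon substituting (30), the optimization problem (28) can be rewritten as

$$\begin{aligned} \max_{\mathbf{U}}\quad & \frac{1}{2}\sum_{i=1}^{N_U}\log_2\left(1 + \frac{p_{ri}^2\,\lvert\mathbf{u}_i^T\mathbf{H}_{br}\mathbf{w}_{bti}\rvert^2}{N_0\,p_{ri}^2 + N_0}\right) && \text{(31a)} \\ \text{s.t.}\quad & \mathbf{u}_i^T\mathbf{h}_{jr} = 0,\ i \ne j, \quad \lVert\mathbf{u}_i\rVert = 1, && \text{(31b)} \\ & \mathbf{W}_{bt} = (\mathbf{U}^T\mathbf{H}_{br})^{\dagger}\,\mathbf{G}_{r1}^{-1}\mathbf{G}_b, && \text{(31c)} \\ & \lVert\mathbf{w}_{bti}\rVert^2 = P_B/N_U, && \text{(31d)} \\ & \lVert(\mathbf{H}_{ur}^T)^{\dagger}\mathbf{G}_{r1}\mathbf{U}^T\mathbf{H}_{br}\mathbf{W}_{bt}\rVert^2 + N_0\lVert(\mathbf{H}_{ur}^T)^{\dagger}\mathbf{G}_{r1}\mathbf{U}^T\rVert^2 \le P_R. && \text{(31e)} \end{aligned}$$

This problem is non-convex, so we turn to finding a suboptimal solution. Since $\mathbf{U}^T$ acts as the receiver at the RS in the downlink transmission, we design it to maximize the "data rate" of the BS-RS transmission instead of that of the two-phase downlink transmission⁶.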

During the BS-RS transmission, the half-duplex RS only receives signals from the BS. Therefore, we do not consider the RS transmit power constraint (31e), which can be met later by adjusting $\mathbf{G}_{r1}$. Then the BS-RS transmission rate maximization problem is formulated as

$$\begin{aligned} \max_{\mathbf{U}}\quad & \sum_{i=1}^{N_U}\log_2\left(1 + \lvert\mathbf{u}_i^T\mathbf{H}_{br}\mathbf{w}_{bti}\rvert^2/N_0\right) && \text{(32a)} \\ \text{s.t.}\quad & \text{(31b), (31c) and (31d)}. && \text{(32b)} \end{aligned}$$

Remark 1: If $P_R \to \infty$, the objective function (31a) will be the same as (32a) except for the pre-log factor 1/2, and the RS power constraint (31e) can be omitted. This means that the two optimization problems are approximately equivalent when the RS has high transmit power.

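Constraint (31c) shows that $\mathbf{W}_{bt}$ is a pseudo inverse of $\mathbf{U}^T\mathbf{H}_{br}$ with power allocation. Define $\bar{\mathbf{U}}_i$ as the matrix $\mathbf{U}$ with the $i$th column being removed. Then using the principle of orthogonal projection [19], we obtain

$$\frac{\lvert\mathbf{u}_i^T\mathbf{H}_{br}\mathbf{w}_{bti}\rvert}{\lVert\mathbf{w}_{bti}\rVert} = \left\lVert\mathbf{u}_i^T\mathbf{H}_{br}\left(\mathbf{I} - \mathbf{H}_{br}^H\bar{\mathbf{U}}_i^*(\bar{\mathbf{U}}_i^T\mathbf{H}_{br}\mathbf{H}_{br}^H\bar{\mathbf{U}}_i^*)^{-1}\bar{\mathbf{U}}_i^T\mathbf{H}_{br}\right)\right\rVert = \left\lVert\mathbf{u}_i^T\mathbf{H}_{br}\,S_{\perp}(\bar{\mathbf{U}}_i^T\mathbf{H}_{br})\right\rVert.$$

Substituting this expression and (31d) into (32a), the problem (32) can be rewritten as

$$\begin{aligned} \max_{\mathbf{U}}\quad & \sum_{i=1}^{N_U}\log_2\left(1 + \frac{P_B\left\lVert\mathbf{u}_i^T\mathbf{H}_{br}\,S_{\perp}(\bar{\mathbf{U}}_i^T\mathbf{H}_{br})\right\rVert^2}{N_U N_0}\right) \\ \text{s.t.}\quad & \mathbf{u}_i^T\mathbf{h}_{jr} = 0,\ i \ne j, \quad \lVert\mathbf{u}_i\rVert = 1. \end{aligned} \tag{33}$$

Constraints (31c) and (31d) are omitted since the objective function no longer depends on $\mathbf{W}_{bt}$.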

Solving problem (33) is nontrivial because we need to jointly design all $\mathbf{u}_i$. To obtain a low-complexity solution, we employ alternating optimization [14] again. We first initialize $\mathbf{U} = \mathbf{0}$. Then we alternately optimize each of the $N_U$ columns of $\mathbf{U}$. In each step, we optimize the $i$th column by solving the problem (33) with all other columns $\bar{\mathbf{U}}_i$ being fixed. After each step, we update the matrix $\mathbf{U}$ by replacing its $i$th column with the optimized $\mathbf{u}_i$. The procedure stops when the value of the objective function in (33) no longer increases. Simulations show that the procedure always converges after each of the $N_U$ columns has been optimized once.

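In the above procedure, we need to solve the optimization problem (33) with fixed $\bar{\mathbf{U}}_i$. Note that the constraint $\mathbf{u}_i^T\mathbf{h}_{jr} = 0$, $i \ne j$, can be rewritten as $\mathbf{u}_i^T\bar{\mathbf{H}}_{ir} = \mathbf{0}$. Any feasible $\mathbf{u}_i$ must lie in the orthogonal subspace of $\bar{\mathbf{H}}_{ir}$. Therefore, we have

$$\mathbf{u}_i = S_{\perp}(\bar{\mathbf{H}}_{ir})\,\mathbf{x}, \tag{34}$$

where $\mathbf{x}$ is an arbitrary vector. Then the optimization problem (33) with fixed $\bar{\mathbf{U}}_i$ can be rewritten as follows by substituting (34),

$$\begin{aligned} \max_{\mathbf{x}}\quad & \left\lVert\mathbf{x}^T S_{\perp}(\bar{\mathbf{H}}_{ir})^T\,\mathbf{H}_{br}\,S_{\perp}(\bar{\mathbf{U}}_i^T\mathbf{H}_{br})\right\rVert \\ \text{s.t.}\quad & \lVert\mathbf{x}\rVert = 1. \end{aligned} \tag{35}$$

The optimal value of $\mathbf{x}$ is the left singular vector of $S_{\perp}(\bar{\mathbf{H}}_{ir})^T\mathbf{H}_{br}S_{\perp}(\bar{\mathbf{U}}_i^T\mathbf{H}_{br})$ corresponding to its largest singular value [19]. Then from (34), we can obtain the optimal $\mathbf{u}_i$.

Substituting the optimization result $\mathbf{U}^{\star}$ into (30), the RS weighting matrix designed for maximizing the downlink sum rate can be obtained as $\mathbf{W}_r^{\star 1} = (\mathbf{H}_{ur}^T)^{\dagger}\mathbf{G}_{r1}\mathbf{U}^{\star T}$.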

2) Design of $\mathbf{W}_r$ From Subproblem (29): We can also show that the optimal RS weighting matrix that maximizes the uplink sum rate has the following structure,

$$\mathbf{W}_r^{\star 2} = \mathbf{U}^{\star}\mathbf{G}_{r2}\mathbf{H}_{ur}^{\dagger}, \tag{36}$$

where $\mathbf{U}^{\star}$ and $\mathbf{G}_{r2}$ can be obtained similarly as in the last subsection. We do not present the detailed derivation for brevity.

⁶ Since the RS does not decode messages in the AF protocol, there is in fact no "BS-RS transmission data rate". We use this terminology here to simplify the optimization problem.

TABLE II
Comparison of Computational Complexity

| Scheme | RS: major operations | RS: complexity | BS: major operations | BS: complexity |
|---|---|---|---|---|
| ZF scheme | pseudo inverse of an $N_R \times 2N_U$ matrix | $O(N_R N_U^2)$ | none | |
| SA scheme | pseudo inverse of an $N_R \times N_U$ matrix | $O(N_R N_U^2)$ | pseudo inverse of an $N_R \times N_U$ matrix | $O(N_R N_U^2)$ |
| Balanced scheme | pseudo inverse of an $N_R \times N_U$ matrix, $N_U$ times of SVD of an $N_R \times (N_R - N_U + 1)$ matrix | $O(N_R N_U^2) + O(N_R N_U (N_R - N_U + 1)^2)$ | pseudo inverse of $N_R \times N_U$ matrices | $O(N_R N_U^2)$ |

3) Balancing $R_U$ and $R_D$ to Maximize Bidirectional Sum Rate: Recall that the bidirectional sum rate is $R_S = R_U + R_D$, while $\mathbf{W}_r^{\star 1}$ and $\mathbf{W}_r^{\star 2}$ are respectively optimized for $R_D$ and $R_U$. To improve $R_S$, we propose the following RS weighting matrix,

$$\mathbf{W}_r^{BL} = c_{\gamma}\left(\gamma\,\mathbf{W}_r^{\star 1} + (1 - \gamma)\,\mathbf{W}_r^{\star 2}\right), \tag{37}$$

where the power adjusting factor $\gamma$, $0 \le \gamma \le 1$, controls the proportion of power given to $\mathbf{W}_r^{\star 1}$ and $\mathbf{W}_r^{\star 2}$ to balance the uplink and downlink sum rates, and $c_{\gamma}$ is used to meet the total RS transmit power constraint. In practical systems, after obtaining $\mathbf{W}_r^{\star 1}$ and $\mathbf{W}_r^{\star 2}$, the RS can search for an optimal $\gamma$ that maximizes the bidirectional sum rate.
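The paper does not specify how the search over $\gamma$ is carried out. One simple possibility is a one-dimensional grid search, sketched below under stated assumptions: `rate_of_gamma` is a hypothetical callable that would, in a real system, build the weighting matrix of (37), scale it to meet the RS power constraint, and return the sum rate from (12). The toy rate function used here is only a placeholder with a known maximizer.

```python
import math


def search_balancing_factor(rate_of_gamma, num_points=101):
    """Grid search for the gamma in [0, 1] that maximizes rate_of_gamma."""
    best_gamma, best_rate = 0.0, rate_of_gamma(0.0)
    for k in range(1, num_points):
        gamma = k / (num_points - 1)
        rate = rate_of_gamma(gamma)
        if rate > best_rate:
            best_gamma, best_rate = gamma, rate
    return best_gamma, best_rate


# Toy stand-in for the bidirectional sum rate (an assumption, not eq. (12)):
# symmetric in gamma and 1 - gamma, so its maximum lies at gamma = 0.5.
def toy_sum_rate(gamma):
    return (0.5 * math.log2(1.0 + 10.0 * gamma)
            + 0.5 * math.log2(1.0 + 10.0 * (1.0 - gamma)))


gamma_opt, rate_opt = search_balancing_factor(toy_sum_rate)
print(gamma_opt, rate_opt)
```

Because the search is over a single scalar, its cost is one rate evaluation per grid point.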

Remark 2: In a TWR system with a single user and a single-antenna BS, $\mathbf{W}_r^{BL}$ turns out to be a maximal-ratio combining and maximal-ratio transmission (MRC-MRT) weighting matrix. It is shown in [2] that the bidirectional sum rate gap between MRC-MRT and the optimal scheme is no more than 0.2 bps/Hz. Although we cannot draw the same conclusion for multi-user multi-antenna TWR systems via rigorous analysis, the simulation results in Section V will show that such a balanced solution performs close to the alternating optimization solution.

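Remark 3: By substituting $\mathbf{W}_r^{B1}$ and $\mathbf{W}_r^{B2}$
into (37), we have

$$\mathbf{W}_r^{BL} = c_\gamma\left(\gamma(\mathbf{H}_{ur}^T)^\dagger\mathbf{G}_{r1}\mathbf{U}^{*T} + (1-\gamma)\mathbf{U}^{*}\mathbf{G}_{r2}\mathbf{H}_{ur}^\dagger\right). \tag{38}$$

As shown in (34), the $i$th column of matrix $\mathbf{U}^{*}$ lies in the
orthogonal subspace of $\mathbf{H}_{\bar{i}r}$, i.e.,
$\mathbf{u}_i^{*} \in \mathcal{S}_\perp(\mathbf{H}_{\bar{i}r})$. The $i$th
column of $(\mathbf{H}_{ur}^T)^\dagger$, denoted $\mathbf{q}_i$, also lies
in that orthogonal subspace, i.e.,
$\mathbf{q}_i \in \mathcal{S}_\perp(\mathbf{H}_{\bar{i}r})$.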

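When $N_R = N_U$, i.e., the number of RS antennas equals the
number of users, the matrix $\mathbf{H}_{\bar{i}r}$ is an
$N_R \times (N_R - 1)$ matrix. Therefore, its orthogonal subspace
$\mathcal{S}_\perp(\mathbf{H}_{\bar{i}r})$ has dimension one. Since both
$\mathbf{u}_i^{*}$ and $\mathbf{q}_i$ lie in the same one-dimensional
subspace, they are linearly dependent, i.e.,
$\mathbf{u}_i^{*} = d_i\mathbf{q}_i$, where $d_i$ is a scalar. Therefore,
we have $\mathbf{U}^{*} = (\mathbf{H}_{ur}^T)^\dagger\mathbf{D}$, where
$\mathbf{D} = \mathrm{diag}(d_1, \cdots, d_{N_U})$. Substituting this
expression into (38), we obtain

$$\mathbf{W}_r^{BL} = (\mathbf{H}_{ur}^T)^\dagger\left(c_\gamma\gamma\mathbf{G}_{r1}\mathbf{D}^T + c_\gamma(1-\gamma)\mathbf{D}\mathbf{G}_{r2}\right)\mathbf{H}_{ur}^\dagger \triangleq (\mathbf{H}_{ur}^T)^\dagger\mathbf{G}^{BL}\mathbf{H}_{ur}^\dagger, \tag{39}$$

where $\mathbf{G}^{BL} = c_\gamma\gamma\mathbf{G}_{r1}\mathbf{D}^T + c_\gamma(1-\gamma)\mathbf{D}\mathbf{G}_{r2}$
is a diagonal matrix.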

Comparing (39) with the RS transceiver in the SA scheme
proposed in [9]-[11], we see that $\mathbf{W}_r^{BL}$ has the same form
as that of the SA scheme. Substituting (39) into (24) and (26),
it is easy to show that the BS transceivers in our balanced
solution also have the same forms as those in the SA scheme.
This indicates that the SA scheme is a special case of the
balanced solution when $N_R = N_U$. In fact, in such a setting,
it is not hard to show that the solution of the interference free
constraints, including (10), is unique and is exactly the
SA scheme.



C. Complexity Comparison 

Here we compare the computational complexities of the
balanced scheme and the existing ZF [8] and SA schemes
[9]-[11].

In the ZF scheme, the major operation at the RS is to
compute the pseudo inverse of an $N_R \times 2N_U$ matrix. Since
all the interference is eliminated by the RS, the BS needs no
further processing. In the SA scheme, the major operations at the RS
and the BS are to compute the pseudo inverses of an $N_R \times N_U$
matrix and an $N_R \times N_U$ matrix, respectively.

In the balanced scheme, to obtain the RS transceiver (37),
we need to compute $\mathbf{H}_{ur}^\dagger$, $\mathbf{U}$,
$\mathbf{G}_{r1}$ and $\mathbf{G}_{r2}$, and search
for the optimal balancing factor $\gamma$. Specifically, we need to
perform $N_U$ singular value decompositions (SVDs) to
alternately design the $N_U$ columns of $\mathbf{U}$. According to (35),
each SVD is performed on an $N_R \times (N_R - N_U + 1)$ matrix.
Only vector norm operations are required to compute the power
allocation matrices $\mathbf{G}_{r1}$ and $\mathbf{G}_{r2}$, whose
complexity is negligible compared with that of the pseudo inverse
and SVD. The complexity of finding the optimal $\gamma$ can also
be ignored, since it only requires a scalar search.
To obtain the BS transceivers in our balanced scheme, (24) and
(26), we need to compute the pseudo inverses of two $N_R \times N_U$
matrices.

A widely used method for computing the pseudo inverse is
the SVD, which requires $O(mn^2)$ flops
to compute the pseudo inverse of an $m \times n$ matrix [21], where
$m > n$. The complexities of the transceiver schemes are
presented in Table II, which shows that the complexity of the
balanced scheme is of the same order as those of the SA and
ZF schemes.
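
To make the order-of-magnitude comparison concrete, the sketch below tallies leading-order flop estimates from the operations listed in this subsection, using $mn^2$ for each pseudo inverse or SVD of an $m \times n$ matrix. It is an illustration only: the helper names are hypothetical, all constant factors other than the literal matrix sizes are dropped, the matrix dimensions follow the text as printed, and the negligible norm and scalar-search costs are ignored.

```python
def pinv_flops(m, n):
    """Leading-order cost m * n^2 of an SVD-based pseudo inverse (or an SVD)
    of an m x n matrix; the larger dimension is treated as m."""
    if m < n:
        m, n = n, m
    return m * n * n


def scheme_flops(n_r, n_u):
    """Rough flop estimates for the three schemes (RS plus BS).

    n_r: number of RS antennas, n_u: number of users.
    """
    # ZF: one pseudo inverse of an N_R x 2N_U matrix at the RS, nothing at the BS.
    zf = pinv_flops(n_r, 2 * n_u)
    # SA: one pseudo inverse at the RS and one at the BS.
    sa = pinv_flops(n_r, n_u) + pinv_flops(n_r, n_u)
    # Balanced: pseudo inverse plus N_U SVDs at the RS, two pseudo inverses at the BS.
    balanced_rs = pinv_flops(n_r, n_u) + n_u * pinv_flops(n_r, n_r - n_u + 1)
    balanced_bs = 2 * pinv_flops(n_r, n_u)
    return {"ZF": zf, "SA": sa, "Balanced": balanced_rs + balanced_bs}


estimates = scheme_flops(n_r=4, n_u=2)
```

For small antenna and user numbers the three totals differ only by modest constant factors, which is what "the same order" means here; the numbers are not measured run times.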

V. Simulation Results 

In this section, we evaluate the performance of the proposed
transceivers and compare them with existing schemes by
simulations. We assume that all channels are independent
and identically distributed Rayleigh fading channels, and all
simulation results are obtained by averaging over 1000 Monte-Carlo
trials. For a fair comparison, we use equal power
allocation at the BS and RS in all the transceiver schemes. We
assume that the noise variance $N_0$ is identical at the BS, RS
and each user. The transmit power of each user is $P_U = 1$. The
BS and RS transmit powers are $P_B$ and $P_R$, respectively. We
define $1/N_0$ as the transmit SNR. Unless otherwise specified,
we set $N_B = 2$, $N_R = 4$, $N_U = 2$, $P_B = P_R = 2$, and
SNR = 30 dB.
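
The setup above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the simulation code behind the figures: the helper names, the fixed seed, and the unit variance of the channel entries are assumptions introduced here; only the definition of the transmit SNR as $1/N_0$ and the default antenna and user numbers come from the text.

```python
import math
import random


def noise_variance_from_snr_db(snr_db):
    """The transmit SNR is defined as 1/N0, so N0 = 10^(-SNR_dB / 10)."""
    return 10.0 ** (-snr_db / 10.0)


def rayleigh_channel(rows, cols, rng):
    """rows x cols matrix (list of rows) of i.i.d. complex Gaussian entries.

    Unit variance per entry is an assumption; the real and imaginary
    parts each have variance 1/2.
    """
    scale = math.sqrt(0.5)
    return [[complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))
             for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)  # fixed seed, only so the sketch is reproducible
n_b, n_r, n_u = 2, 4, 2  # default antenna and user numbers
n0 = noise_variance_from_snr_db(30.0)
h_br = rayleigh_channel(n_r, n_b, rng)  # BS-to-RS channel, N_R x N_B
h_ur = rayleigh_channel(n_r, n_u, rng)  # user-to-RS channel, N_R x N_U
```

One such pair of channel matrices corresponds to one Monte-Carlo trial; averaging a performance metric over many independent draws gives the curves reported in this section.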

A. Impact of the Adjusting Factor

The sum rates of the balanced scheme versus the power adjusting
factor $\gamma$ are shown in Fig. 3, where the upper sub-figure
shows the uplink and downlink sum rates and the lower sub-figure
shows the bidirectional sum rate. When $\gamma = 0$, the RS
weighting matrix is $\mathbf{W}_r = \mathbf{W}_r^{B2}$, which aims to
maximize the uplink sum rate. Therefore, the system achieves a high
uplink sum rate but a low downlink sum rate in this case. By contrast,
when $\gamma = 1$, the system achieves a high downlink rate but a
low uplink rate. By adjusting the value of $\gamma$, the uplink and
downlink performance are balanced and a higher bidirectional
sum rate is achieved. The optimal $\gamma$ in this case is 0.5.



Fig. 3. Sum rates vs. the power adjusting factor $\gamma$, $N_B = 2$, $N_R = 4$, and $N_U = 2$.



B. Convergence of the Alternating Optimization Solution 

To study the convergence of the alternating optimization
algorithm, we use the proposed balanced transceiver, the ZF and
SA transceivers, and multiple random weighting matrices in turn
as its initial value. When using random matrices as the initial
values, we pick, from the multiple results, the one
that converges to the highest sum rate.

Figure 4 shows the bidirectional sum rate versus the iteration
number. The sum rate converges rapidly, but the converged
result depends on the initial value because the optimization
problem is non-convex. Nonetheless, by using
multiple random initial values, a higher bidirectional sum rate
can be achieved. We observe from extensive simulations that
when the number of random initial values exceeds 20, the
performance gain is marginal. Therefore, we take the result
with 20 random initial values as a near-optimal result. The
figure shows that the performance of the balanced transceiver is very
close to that of the near-optimal result. In the following, we
use the balanced transceiver as the initial value for the
alternating optimization.
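
The multiple-initial-value strategy described above is a generic multi-start procedure and can be sketched independently of the transceiver design. In the sketch below, `best_of_restarts` and the toy objective are hypothetical: the one-dimensional hill climber merely stands in for one run of the alternating optimization, and the toy function stands in for a non-convex sum rate with two local maxima.

```python
import random


def best_of_restarts(optimize, initial_values):
    """Run optimize(init) -> (solution, sum_rate) from every initial value
    and keep the result with the highest sum rate."""
    best_solution, best_rate = None, float("-inf")
    for init in initial_values:
        solution, rate = optimize(init)
        if rate > best_rate:
            best_solution, best_rate = solution, rate
    return best_solution, best_rate


def toy_rate(x):
    """Toy non-concave objective with one local and one global maximum."""
    return -(x * x - 1.0) ** 2 + 0.5 * x


def toy_local_search(x, step=0.01, iters=2000):
    """Simple hill climbing; converges to the local maximum nearest to x."""
    for _ in range(iters):
        left, right = toy_rate(x - step), toy_rate(x + step)
        if max(left, right) <= toy_rate(x):
            break
        x = x - step if left > right else x + step
    return x, toy_rate(x)


rng = random.Random(1)
inits = [rng.uniform(-2.0, 2.0) for _ in range(20)]
single_solution, single_rate = toy_local_search(inits[0])
multi_solution, multi_rate = best_of_restarts(toy_local_search, inits)
```

A single start can end in the poorer local maximum, whereas the best of many starts cannot do worse than any individual run, which mirrors the behavior of the single random and 20 random curves discussed above.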




[Figure: bidirectional sum rate versus iteration number; curves: 20 random, BL, ZF, SA, single random]

Fig. 4. Convergence of the alternating optimization algorithm with different
initial values, $N_B = 2$, $N_R = 4$, and $N_U = 2$.



C. Comparison among Different Transceivers 

We compare the bidirectional sum rates of the alternating
optimization solution and the balanced transceiver with those of
the ZF [8] and SA schemes [9]-[11]. We also compare with a
minimum-mean-square-error (MMSE) transceiver without the
interference free constraints, where the MMSE BS transceiver
and MMSE RS transceiver are alternately optimized [22].

Figure 5 shows the impact of the antenna number of the RS,
where "Al-Opt" denotes the alternating optimization solution.
When there are two users, the ZF scheme needs at least 4
antennas at the RS to cancel all the interference, while the SA
scheme only needs 2 antennas. From the simulation results, we
see that when $N_R < 4$ the sum rate of the ZF scheme drops
sharply due to the residual IUI, whereas the SA scheme performs
much better. When $N_R > 4$, the ZF scheme becomes superior
because it can remove all IUI, while the SA scheme suffers from a
power loss when aligning the downlink signals with the uplink
signals. The sum rate of the balanced transceiver is close to
that of the alternating optimization solution, and both are higher
than those of the existing ZF and SA schemes for any antenna number
at the RS.
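
The antenna requirement quoted above for two users can be written as a small feasibility check. The generalization to an arbitrary user number, $2N_U$ antennas for ZF and $N_U$ for SA, is an assumption made here; it is consistent with the pseudo inverse dimensions given in the complexity comparison but is not stated as a general rule in the text. The function names are hypothetical.

```python
def min_rs_antennas(scheme, num_users):
    """Smallest RS antenna number for which the scheme can cancel all
    interference (assumed rule: 2 * N_U for ZF, N_U for SA)."""
    if scheme == "ZF":
        return 2 * num_users
    if scheme == "SA":
        return num_users
    raise ValueError("unknown scheme: " + str(scheme))


def can_cancel_all_interference(scheme, num_rs_antennas, num_users):
    """True if the RS has enough antennas under the assumed rule."""
    return num_rs_antennas >= min_rs_antennas(scheme, num_users)


zf_needed = min_rs_antennas("ZF", 2)
sa_needed = min_rs_antennas("SA", 2)
```

With two users this reproduces the values in the text: 4 antennas for ZF and 2 for SA.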

Fig. 5. Sum rates of four transceivers vs. RS antenna number, $N_B = 2$, and $N_U = 2$.

Figure 6 shows the impact of the user number on the
performance of different transceivers. We set $N_B = N_R = 4$
and $P_B = P_R = 4$. A round robin scheduler is applied, where
the scheduled user number $N_U$ ranges from 1 to 4. The figure shows
that the performance of the ZF scheme degrades severely when
$N_U > 2$ because the four-antenna RS cannot cancel all
IUI. With the SA scheme, the proposed balanced scheme and
the alternating optimization solution, the system achieves the
highest bidirectional sum rate when three users are scheduled,
where both the balanced scheme and the alternating optimization
result have about 2 bps/Hz sum rate gain over the SA scheme.
When $N_U = 4$, the balanced transceiver and the SA scheme
perform exactly the same. This
agrees well with our earlier analysis in Remark 3.



Fig. 6. Sum rate of four transceivers vs. user number, $N_B = 4$, and $N_R = 4$.



In Fig. 7, we compare the sum rate of the interference free
transceiver schemes with that of the MMSE transceiver [22]. We can
see that our balanced scheme provides a higher sum rate than the
existing ZF and SA schemes over a wide range of transmit SNR.
The MMSE scheme is slightly superior to our balanced scheme
in the low SNR region, but is inferior to the proposed scheme
in the high SNR region. This is because the MMSE solution in
[22] is obtained via alternating optimization, which is not
guaranteed to be globally optimal. In the high SNR region, the
system is interference-limited, so the proposed scheme
outperforms the MMSE solution by removing all the interference.




Fig. 7. Sum rate vs. SNR, $N_B = 2$, $N_R = 4$, and $N_U = 2$.

In Fig. 8, we provide the sum rate under each single
channel realization to understand the behavior of the IUI free
transceivers. We see that the ZF and SA schemes perform
differently for a given channel. The ZF scheme requires the RS
to separate all the signals transmitted by the users and the BS, and
performs well only when the channel vectors from the users
and the BS are mutually orthogonal. In contrast, the SA scheme
needs to align the signals transmitted by the BS with the
directions of the signals transmitted by the users, and thus
performs well only when the channel vectors from the users
and those from the BS have the same direction. Our balanced
scheme can adaptively adjust its transmission strategy depending
on the channel condition to remain IUI free without
requiring channel "orthogonalization" or "alignment".
Therefore, its sum rate is always higher than those of the ZF and
SA schemes.




[Figure: sum rate versus channel realizations]

Fig. 8. Sum rate under different channel realizations, $N_B = 2$, $N_R = 4$, and $N_U = 2$.



In Fig. 9, we compare the outage probabilities of the IUI free
transceivers with $10^5$ Monte-Carlo trials, where the system is
in outage if its bidirectional sum rate drops below a given
threshold, which is set to 2 bps/Hz. We see that our balanced
scheme achieves a much lower outage probability than both the
ZF and SA schemes. Moreover, since the sum rate of the
balanced scheme is always "riding on the peak" of the ZF and
SA schemes, as shown in Fig. 8, the balanced scheme achieves a
higher diversity gain, and its outage probability decreases
much faster than those of the ZF and SA schemes as the SNR
increases.
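
The outage probability defined above is simply the fraction of Monte-Carlo trials whose bidirectional sum rate falls below the threshold. A minimal sketch follows; the function name is hypothetical, and the exponentially distributed rates are placeholder data used only to exercise the estimator, not simulated transceiver rates.

```python
import random


def outage_probability(sum_rates, threshold=2.0):
    """Fraction of trials whose bidirectional sum rate (bps/Hz) is below
    the threshold (2 bps/Hz by default, as in the text)."""
    if not sum_rates:
        raise ValueError("need at least one trial")
    outages = sum(1 for rate in sum_rates if rate < threshold)
    return outages / len(sum_rates)


rng = random.Random(0)
# Placeholder rates with mean 4 bps/Hz; real rates would come from
# evaluating a transceiver scheme on independent channel draws.
toy_rates = [rng.expovariate(0.25) for _ in range(100000)]
p_out = outage_probability(toy_rates)
```

With $10^5$ trials, outage probabilities down to roughly $10^{-4}$ can be estimated with some reliability; lower values would need more trials.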




[Figure: outage probability versus SNR (dB); recoverable legend entries: ZF, SA]

Fig. 9. Outage probability vs. SNR, $N_B = 2$, $N_R = 4$, and $N_U = 2$.

VI. Conclusion 

In this paper, we have designed transceivers for multi-user
multi-antenna two-way relay systems. We first employed
alternating optimization to find the BS and RS transceivers
that maximize the bidirectional sum rate under interference
free constraints. We then proposed a low complexity
balanced transceiver scheme. By analyzing the solution of the
alternating optimization in the high transmit power region, we found
that zero-forcing BS transceivers are asymptotically optimal.
Given the BS transceivers, we designed the RS transceivers
by separately maximizing the uplink and downlink rates,
which are then combined with a power adjusting factor to
maximize the bidirectional sum rate. The existing signal alignment
scheme was shown to be a special case of the balanced scheme
when the relay antenna number equals the user number.
Simulation results showed that the performance gap between
the balanced scheme and the alternating optimization solution
is minor. In general system settings, the bidirectional sum rate
of the balanced transceiver is higher than those of the existing signal
alignment and zero-forcing schemes.

Appendix 

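Proof of the Optimal Structure of $\mathbf{W}_r$ in (30)

Define $\mathbf{V}_{ur} \in \mathbb{C}^{N_R \times N_U}$ as a matrix
consisting of the $N_U$ singular vectors of $\mathbf{H}_{ur}^T$, and
$\mathbf{V}_{ur}^\perp \in \mathbb{C}^{N_R \times (N_R - N_U)}$ as a
matrix consisting of the $N_R - N_U$ singular vectors of the
orthogonal subspace of $\mathbf{H}_{ur}^T$. Then
$\mathbf{V}_F = [\mathbf{V}_{ur}\ \mathbf{V}_{ur}^\perp]$ is a
unitary matrix, and $\mathbf{W}_r$ can be expressed as

$$\mathbf{W}_r = \mathbf{V}_F\mathbf{V}_F^H\mathbf{W}_r = \mathbf{V}_{ur}\mathbf{A}^T + \mathbf{V}_{ur}^\perp\mathbf{B}^T, \tag{40}$$

where $\mathbf{A} \in \mathbb{C}^{N_R \times N_U}$ and
$\mathbf{B} \in \mathbb{C}^{N_R \times (N_R - N_U)}$ are two arbitrary
matrices.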

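Since $\mathbf{H}_{ur}^T\mathbf{V}_{ur}^\perp = \mathbf{0}$, from (11), the
downlink sum rate $R_D$ can be written as

$$R_D = \frac{1}{2}\sum_{i=1}^{N_U}\log_2\left(1 + \frac{|\mathbf{h}_{iur}^T\mathbf{V}_{ur}\mathbf{A}^T\mathbf{H}_{br}\mathbf{w}_{bt,i}|^2}{N_0\|\mathbf{h}_{iur}^T\mathbf{V}_{ur}\mathbf{A}^T\|^2 + N_0}\right). \tag{41}$$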



Substituting (40) into (24), we have
$\mathbf{W}_{bt} = (\mathbf{H}_{ur}^T\mathbf{V}_{ur}\mathbf{A}^T\mathbf{H}_{br})^\dagger\mathbf{G}_b$,
which is not a function of $\mathbf{B}$.
Therefore, the value of $\mathbf{B}$ does not affect the constraints (24)
and (25) in problem (28). According to (41), the objective
function $R_D$ of problem (28) also does not depend on $\mathbf{B}$.

Substituting (40) into (10), we obtain
$\mathbf{h}_{iur}^T\mathbf{W}_r\mathbf{h}_{jur} = \mathbf{h}_{iur}^T\mathbf{V}_{ur}\mathbf{A}^T\mathbf{h}_{jur} = 0$, $i \neq j$,
which shows that the value of
$\mathbf{B}$ does not affect the constraint (10) either.

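We can show that the RS transmit power is minimized when
$\mathbf{B} = \mathbf{0}$ as follows,

$$\begin{aligned}
& P_B\|(\mathbf{V}_{ur}\mathbf{A}^T + \mathbf{V}_{ur}^\perp\mathbf{B}^T)\mathbf{H}_{br}\mathbf{W}_{bt}\|^2 + N_0\|\mathbf{V}_{ur}\mathbf{A}^T + \mathbf{V}_{ur}^\perp\mathbf{B}^T\|^2 \\
&\quad = P_B\|\mathbf{V}_{ur}\mathbf{A}^T\mathbf{H}_{br}\mathbf{W}_{bt}\|^2 + P_B\|\mathbf{V}_{ur}^\perp\mathbf{B}^T\mathbf{H}_{br}\mathbf{W}_{bt}\|^2 + N_0\|\mathbf{V}_{ur}\mathbf{A}^T\|^2 + N_0\|\mathbf{V}_{ur}^\perp\mathbf{B}^T\|^2 \\
&\quad \ge P_B\|\mathbf{V}_{ur}\mathbf{A}^T\mathbf{H}_{br}\mathbf{W}_{bt}\|^2 + N_0\|\mathbf{V}_{ur}\mathbf{A}^T\|^2 .
\end{aligned}$$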

This indicates that for any given
$\mathbf{W}_r = \mathbf{V}_{ur}\mathbf{A}^T + \mathbf{V}_{ur}^\perp\mathbf{B}^T$,
we can always find a $\mathbf{W}_r^{*} = \mathbf{V}_{ur}\mathbf{A}^T$ that
achieves the same downlink rate $R_D$ as $\mathbf{W}_r$ but consumes no
more RS power. Therefore, the optimal $\mathbf{W}_r$ for (28) should have
the structure $\mathbf{W}_r = \mathbf{V}_{ur}\mathbf{A}^T$.

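Denote the singular value decomposition of $\mathbf{H}_{ur}^T$ as
$\mathbf{U}_{ur}\mathbf{D}_{ur}\mathbf{V}_{ur}^H$, where $\mathbf{D}_{ur}$
and $\mathbf{U}_{ur}$ are both non-singular matrices,
then we have

$$\mathbf{W}_r = \mathbf{V}_{ur}\mathbf{A}^T = \mathbf{V}_{ur}\left(\mathbf{D}_{ur}^{-1}\mathbf{U}_{ur}^H\mathbf{U}_{ur}\mathbf{D}_{ur}\right)\mathbf{A}^T = (\mathbf{H}_{ur}^T)^\dagger\mathbf{U}_{ur}\mathbf{D}_{ur}\mathbf{A}^T = (\mathbf{H}_{ur}^T)^\dagger\mathbf{M}^T,$$

where $\mathbf{M}^T = \mathbf{U}_{ur}\mathbf{D}_{ur}\mathbf{A}^T$. Divide
the matrix $\mathbf{M} = (\mathbf{m}_1, \cdots, \mathbf{m}_{N_U})$ into two
matrices, $\mathbf{U} = (\mathbf{u}_1, \cdots, \mathbf{u}_{N_U})$ and
$\mathbf{G}_{r1} = \mathrm{diag}(p_{r1}, \cdots, p_{rN_U})$, where
$\mathbf{u}_i = \mathbf{m}_i/\|\mathbf{m}_i\|$ and
$p_{ri} = \|\mathbf{m}_i\|$. Finally, we have
$\mathbf{W}_r = (\mathbf{H}_{ur}^T)^\dagger\mathbf{G}_{r1}\mathbf{U}^T$.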
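
The last step of the proof, splitting each column of the matrix $\mathbf{M}$ into a unit-norm direction and a nonnegative amplitude, is a plain column normalization. The sketch below illustrates it on a small hand-made example; the function name and the sample columns are hypothetical, and the matrix is stored as a list of columns purely for convenience.

```python
import math


def split_norm_and_direction(m_columns):
    """Given the columns m_i of M, return the unit vectors u_i = m_i / ||m_i||
    and the amplitudes p_i = ||m_i||, so that m_i = p_i * u_i."""
    directions, norms = [], []
    for column in m_columns:
        norm = math.sqrt(sum(abs(entry) ** 2 for entry in column))
        if norm == 0.0:
            raise ValueError("a zero column has no direction")
        directions.append([entry / norm for entry in column])
        norms.append(norm)
    return directions, norms


# Two sample complex columns (illustrative values only).
columns = [[3 + 4j, 0j], [1 + 0j, 1j]]
u_cols, p = split_norm_and_direction(columns)
```

Collecting the amplitudes on a diagonal gives the power allocation matrix and collecting the unit vectors gives the direction matrix, which is the factorization used in the final expression of the proof.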